ABSTRACT

Under a Congressional mandate, the General Accounting Office 
(GAO) studied the costs of implementing tests that will be required under the 
No Child Left Behind Act (NCLBA) and the reauthorization of the Title I 
program. In passing the legislation, Congress increased the frequency with
which states are to measure student achievement in mathematics and reading 
and added science as another subject. Congress also authorized funding to 
support state efforts to develop and implement tests for this purpose. Using 
data from many sources, GAO determined the characteristics of states' Title I
tests and made estimates of what states may spend to implement the required 
tests. The study also identified factors that explain the variation in 
expenses among states. The study found that the majority of states do 
administer statewide tests and customize questions to measure student 
learning against state standards. The report contains three estimates of 
total expenditures between fiscal year 2002 and 2008, based on different 
assumptions about the types of test questions states may choose to implement 
and how they are scored. If all states use tests with multiple-choice 
questions that are machine scored, GAO estimates that total state 
expenditures will be about $1.9 billion. Increasing the amount of hand 
scoring required is estimated to increase testing expenses. Given that 
significant expenses may be associated with testing, GAO is recommending that 
the Department of Education facilitate the sharing of information on states'
experiences in attempting to reduce expenses. Eight appendixes contain 
supplemental information and summaries of reporting requirements under the 
NCLBA. (Contains 13 tables and 7 figures.) (SLD)

Why GAO Did This Study

The No Child Left Behind Act of 
2001 (NCLBA) reauthorized the 
$10 billion Title I program, which
seeks to improve the educational 
achievement of 12.5 million 
students at risk. In passing the 
legislation, Congress increased the 
frequency with which states are to 
measure student achievement in 
mathematics and reading and 
added science as another subject. 
Congress also authorized funding 
to support state efforts to develop 
and implement tests for this 
purpose. 

Congress mandated that GAO study 
the costs of implementing the 
required tests. This report 
describes characteristics of states' 
Title I tests, provides estimates of 
what states may spend to 
implement the required tests, and 
identifies factors that explain 
variation in expenses. 



What GAO Recommends 



Given that significant expenses 
may be associated with testing, 
GAO is recommending that 
Education facilitate the sharing of 
information on states' experiences 
in attempting to reduce expenses. 
Education agreed with GAO's 
recommendation but raised 
concerns about GAO's 
methodology for estimating 
expenditures. 

Contact: 512-7215 or shaulm@gao.gov.

What GAO Found

The majority of states administer statewide tests and customize questions to 
measure student learning against their state standards. These states differ 
along other characteristics, however, including the types of questions on 
their tests and how they are scored, the extent to which actual test questions 
are released to the public following the tests, and the number of new tests 
they need to develop to comply with the NCLBA. 



GAO provides three estimates of total expenditures between fiscal year 
2002 and 2008, based on different assumptions about the types of test 
questions states may choose to implement and how they are scored. The 
method by which tests are scored largely explains the differences in GAO's 
estimates. 



If all states use tests with multiple-choice questions, which are machine 
scored, GAO estimates that the total state expenditures will be about 
$1.9 billion. If all states use tests with a mixture of multiple-choice questions 
and a limited number of open-ended questions that require students to write 
their response, such as an essay, which are hand scored, GAO estimates 
spending to be about $5.3 billion. GAO estimates that spending will be
about $3.9 billion if states keep the mix of question types they reported to
GAO. In general, hand scoring is more expensive and requires more time
and labor than machine scoring. Benchmark funding for assessments as
specified in NCLBA will cover a larger percentage of estimated expenditures
for tests composed of multiple-choice questions and a smaller percentage of
estimated expenditures for tests composed of a mixture of multiple-choice
and open-ended questions. Several states are exploring ways to reduce 
assessment expenses, but information on their experiences is not broadly 
shared among states. 



[Figure: GAO's estimates of total state expenditures, in billions of dollars.]

Source: GAO analysis.

Accountability * Integrity * Reliability



United States General Accounting Office 
Washington, DC 20548



May 8, 2003 

The Honorable Judd Gregg
Chairman, Committee on Health, Education,
Labor, and Pensions
United States Senate 

The Honorable Edward M. Kennedy 
Ranking Minority Member, Committee on 
Health, Education, Labor, and Pensions 
United States Senate 

The Honorable John A. Boehner 
Chairman, Committee on Education 
and the Workforce 
House of Representatives

The Honorable George Miller
Ranking Minority Member, Committee on 
Education and the Workforce 
House of Representatives 

Title I, the largest source of federal funding for primary and secondary
education, provided states $10.3 billion in fiscal year 2002 to improve the
educational achievement of 12.5 million students at risk. In passing the No
Child Left Behind Act of 2001 (NCLBA), Congress increased funding for
Title I and placed additional requirements on states and schools for
improving student performance. To provide an additional basis for making
judgments about student progress, NCLBA increased the frequency with
which states are to assess students in mathematics and reading and added
science as another subject. Under NCLBA, states can choose to administer
statewide, local, or a combination of state and local assessments, but these
assessments must measure states’ content standards for learning. If a state
fails to fulfill NCLBA requirements, the Department of Education
(Education) can withhold federal funds designated for state
administration until the requirements have been fulfilled. To support states
in developing and implementing their assessments, Congress authorized
specific funding to be allocated to the states between fiscal years 2002 and
2007.

NCLBA requires that states test all students annually in grades 3 through
8 in mathematics and reading or language arts and at least once in one of 
the high school grades by the 2005-06 school year. It also requires that 
states test students in science at least once in elementary, middle, and high 
school by 2007-08. Some states have already developed assessments in 
many of the required subjects and grades. 

In the conference report accompanying passage of the NCLBA, Congress 
mandated that we do a study of the anticipated aggregate cost to states, 
between fiscal years 2002 and 2008, for developing and administering the
mathematics, reading or language arts, and science assessments required
under section 1111(b) of the act. As agreed with your offices, this report
(1) describes characteristics of states’ Title I assessments and (2) provides
estimates of what states may spend to implement the required
assessments between fiscal year 2002 and 2008 and identifies factors that
explain variation in expenses.¹

To determine the characteristics of states’ Title I assessments, we
collected information through a survey sent to the 50 states, the District of 
Columbia, and Puerto Rico; all 52 responded to our survey. We also 
reviewed published studies detailing the characteristics of states’ 
assessments. To estimate projected expenditures all states are expected to 
incur, we reviewed 7 states’ expenditures — all of which had implemented 
the 6 assessments required by the 1994 Elementary and Secondary 
Education Act (ESEA) reauthorization and were testing students in many 
of the additional subjects and grades required by NCLBA. The 7 states 
were Colorado, Delaware, Maine, Massachusetts, North Carolina, Texas, 
and Virginia. To estimate projected expenditure ranges for all states, we 
used expenditures from these 7 states coupled with key information 
gathered through a survey completed by each state’s assessment director. 
We estimated projected state expenditures for test development, 
administration, scoring, and reporting results for both assessments that 
states need and assessments that states currently have in place. Our 
methodology for estimating expenditures was reviewed by several internal 
and external experts and their suggestions have been incorporated as 
appropriate. Education officials were also briefed on our methodology and 
raised no substantial concerns. As agreed with your offices, we did not 



¹NCLBA authorizes funding through fiscal year 2007 for assessments. However, consistent
with the mandate for this study, we examined expenditures from fiscal year
2002 through 2008, enabling us to more fully capture expenditures associated with the
science assessments, which are required to be administered in school year 2007-08. 

determine expenditures for alternate assessments for students with
disabilities or expenditures for English language proficiency testing. In
addition, we did not determine the expenditures local school districts may
incur with respect to these assessments. To determine what factors
account for variation in projected expenditures, we reviewed the 7 states’
expenditures, noting the test characteristics that were associated with 
specific types and levels of expenditure. We supplemented our 
examination of state expenditures with interviews of test publishers and 
contractors and state assessment officials in these states regarding the 
factors that account for price and expenditure variation. The expenditure 
data that we received were not audited. Actual expenditures may vary 
from projected amounts, particularly when events or circumstances are 
different from those assumed. All estimates are reported in nominal 
dollars unless otherwise noted. 

We conducted our work in accordance with generally accepted 
government auditing standards between April 2002 and March 2003. 

(See app. I for more details about our scope and methodology.) 



The majority of states share two characteristics— they administer 
statewide assessments rather than individual local assessments and use 
customized questions to measure the content taught in the state schools 
rather than questions from commercially available tests. However, states 
differ in many other respects. For example, some states use assessments 
that include multiple-choice questions and other states include a mixture 
of multiple-choice questions and a limited number of questions that
require students to write their response, such as an essay. Many states that
use questions that require students to write their response believe that
such questions enable them to more effectively measure certain skills,
such as writing. However, others believe that multiple-choice questions
also allow them to assess such skills. In addition, some states make actual
test questions available to the public after testing but differ with respect to
the percentage of test questions they publicly release and, consequently,
the number of questions they will need to replace. States also vary in the
number of new tests they reported needing to develop to comply with the 
NCLBA, which ranged from 0 to 17. 

We provide three estimates ($1.9, $3.9, and $5.3 billion) of total spending
by states between fiscal years 2002 and 2008, with the method by which
assessments are scored largely explaining the differences in our estimates.
These estimates are based on expenditures associated with new
assessments as well as existing assessments. The $1.9 billion estimate is
based on the assumption that all states will use multiple-choice questions,
which are machine scored. The $3.9 billion estimate is based on the 
assumption that all states keep the mix of question types — whether 
multiple-choice or a combination of multiple-choice and open-ended — 
states reported to us. The $5.3 billion estimate is based on the assumption 
that all states will use a combination of multiple-choice questions and 
questions that require students to write their response, such as an essay, 
which are hand scored. Several states are exploring ways to reduce 
assessment expenses. This information could be beneficial to others;
however, it is currently not being broadly shared. Given that significant
expenses may be associated with testing, we are recommending that 
Education facilitate the sharing of information on states’ experiences as 
they attempt to reduce expenses. Education agreed with our 
recommendation, but raised concerns about our methodology for 
estimating expenditures. 



Background 



Enacted as part of President Johnson’s War on Poverty, the original
Title I program was created in 1965, but the 1994 and, most recently, the
2001 reauthorizations of ESEA mandated fundamental changes to
Title I. The 1994 ESEA reauthorization required states to develop state
standards and assessments to ensure that students served by Title I were 
held to the same standards of achievement as other students. Some states 
had already implemented assessments prior to 1994, but they tended to be 
norm referenced — a student’s performance was compared to the 
performance of all students nationally. The 1994 ESEA reauthorization 
required assessments that were criterion referenced — students’ 
performance was to be judged against the state standards for what 
children should know and be able to do.² In passing the NCLBA, Congress
built on the 1994 requirements by, among other things, increasing the 
number of grades and subject areas in which states were required to 
assess students, as shown in table 1. NCLBA requires annual testing of 
students in third through eighth grades, in mathematics and reading or 
language arts. It also requires mathematics and reading or language arts 
testing in one of the high school grades (10-12). States must also assess 



²A norm referenced test evaluates an individual’s performance in relation to the
performance of a large sample of others, usually selected to represent all students
nationally in the same grade or age range. Criterion referenced tests are assessments that 
measure the mastery of specific skills or subject content and focus on the performance of 
an individual as measured against a standard or criterion rather than the performance of 
others taking the test. 

students in science at least once in elementary (3-5), middle (6-9), and high
school (10-12). NCLBA gives the states until the 2005-06 school year to 
administer the additional mathematics and reading or language arts 
assessments and until the 2007-08 school year to administer the science 
assessments (see app. II for a summary of Title I assessment 
requirements). 



Table 1: Number of Assessments and Subject Areas Required by the 1994 and
2001 ESEA Reauthorizations

                            Number of required assessments
Subject                     1994 ESEA            2001 ESEA
                            reauthorization      reauthorization
Reading or language arts    3                    7
Mathematics                 3                    7
Science                     0                    3
Total                       6                    17

Source: P.L. No. 103-382 (1994) and P.L. No. 107-110 (2001).
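
As an illustration only, and not part of GAO's analysis, the totals in table 1 can be cross-checked against the testing requirements described in the text. The sketch below assumes that annual testing in grades 3 through 8 plus one high school grade yields 7 assessments per subject, and that science is tested once in each of three grade spans; the variable and function names are hypothetical.

```python
# Illustrative cross-check of table 1; names here are hypothetical.
# Counts per subject as reported in table 1.
REQUIRED = {
    "Reading or language arts": {"1994": 3, "2001": 7},
    "Mathematics": {"1994": 3, "2001": 7},
    "Science": {"1994": 0, "2001": 3},
}


def total_required(law: str) -> int:
    """Sum the required assessments across subjects for one reauthorization."""
    return sum(counts[law] for counts in REQUIRED.values())


def nclba_total_from_rules() -> int:
    """Re-derive the 2001 total from the requirements described in the text."""
    annual_grades = len(range(3, 9))      # grades 3 through 8, tested every year
    high_school = 1                       # one of the high school grades
    per_subject = annual_grades + high_school
    science = 3                           # once each in elementary, middle, high school
    return 2 * per_subject + science      # reading or language arts + mathematics + science


if __name__ == "__main__":
    print("1994 total:", total_required("1994"))
    print("2001 total:", total_required("2001"))
    print("2001 total from rules:", nclba_total_from_rules())
```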

Unlike the 1994 ESEA reauthorization, NCLBA does not generally permit 
Education to allow states additional time to implement these assessments 
beyond the stated time frames.³ Under the 1994 ESEA reauthorization,
Congress allowed states to phase in the 1994 ESEA assessment
requirements over time, giving states until the beginning of the 2000-01
school year to fully implement them with the possibility of limited time
extensions. In April 2002, we reported that the majority of states were not
in compliance with the Title I accountability and assessment provisions
required by the 1994 law.⁴

Every state applying for Title I funds must agree to implement the changes 
described in the 2001 act, including those related to the additional 
assessments. In addition to the regular Title I state grant, NCLBA 
authorizes additional funding to states for these assessments between 
fiscal year 2002 and 2007.⁵ These funds are to be allocated each year to



³The Secretary of Education may provide states 1 additional year if the state demonstrates
that exceptional or uncontrollable circumstances, such as a natural disaster or precipitous
and unforeseen decline in the financial resources of the state, prevented full
implementation of the academic assessments by the deadlines.

⁴U.S. General Accounting Office, Title I: Education Needs to Monitor States’ Scoring of
Assessments, GAO-02-393 (Washington, D.C.: Apr. 1, 2002).

⁵According to Education, there are also other sources of funding in NCLBA that states may
draw upon for assessment related expenses. 

states, with each state receiving $3 million, regardless of its size, plus an
amount authorized based on its share of the nation’s school age
population. States must use the funds to pay the cost of developing the
additional state standards and assessments. If a state has already
developed the required standards and assessments, it may use these funds
to, among other things, develop challenging state academic content and
student academic achievement standards in subject areas other than those
required under Title I and to ensure the validity and reliability of state
assessments. NCLBA authorized $490 million for fiscal year 2002 for state
assessments and such funds as may be necessary through fiscal year
2007. If in any year Congress appropriates less than the amounts
shown in table 2, states may defer or suspend testing; however, states are
still required to develop the assessments. In fiscal year 2002, states
received $387 million for assessments.





Table 2

Fiscal year    Appropriation benchmark
2002           $370,000,000
2003            380,000,000
2004            390,000,000
2005            400,000,000
2006            400,000,000
2007            400,000,000
Total          $2.34 billion

Source: P.L. No. 107-110 (2001).
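
The allocation rule described above (a $3 million base per state plus an amount based on each state's share of the school age population) and the benchmark total can be sketched as follows. This is a minimal illustration, not the statutory formula: it assumes the funds remaining after the base grants are divided in direct proportion to population share, and the state names, shares, and appropriation in the usage example are hypothetical.

```python
# Illustrative sketch only; the proportional split of the remainder is an
# assumption, and the example states and shares are hypothetical.
BASE_GRANT = 3_000_000  # per-state base amount described in the text

# Appropriation benchmarks by fiscal year, as listed in table 2.
BENCHMARKS = {
    2002: 370_000_000,
    2003: 380_000_000,
    2004: 390_000_000,
    2005: 400_000_000,
    2006: 400_000_000,
    2007: 400_000_000,
}


def allocate(appropriation: float, population_shares: dict) -> dict:
    """Give each state the base grant plus its share of the remaining funds."""
    remainder = appropriation - BASE_GRANT * len(population_shares)
    if remainder < 0:
        raise ValueError("appropriation does not cover the base grants")
    return {
        state: BASE_GRANT + remainder * share
        for state, share in population_shares.items()
    }


if __name__ == "__main__":
    # Sum of the six yearly benchmarks, for comparison with the table total.
    print("Benchmark total:", sum(BENCHMARKS.values()))

    # Hypothetical three-state example with made-up population shares.
    example = allocate(20_000_000, {"State A": 0.5, "State B": 0.3, "State C": 0.2})
    for state, amount in example.items():
        print(state, round(amount))
```

Under this assumption, the fixed base grant makes up a larger fraction of a small state's allocation than of a large state's.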



Other organizations have provided cost estimates of implementing the
required assessments. The National Association of State Boards of
Education (NASBE) estimated that states would spend between
$2.7 billion and $7 billion to implement the required assessments.
AccountabilityWorks estimated that states would spend about $2.1 billion.⁶

States can choose to use statewide assessments, local assessments, or 
both to comply with NCLBA. States can also choose to develop their own 
test questions or augment commercially available tests with questions so 



⁶NASBE and AccountabilityWorks made different assumptions regarding what costs would
vary with the number of students tested and which would be invariant costs. For example,
NASBE assumed that development costs would vary by the number of students taking the
test; AccountabilityWorks did not. Additionally, AccountabilityWorks reports having
verified its assumptions with officials from two states, while the authors of the NASBE
study do not report having verified their assumptions with state officials.

that they measure what students are actually taught in school. However,
NCLBA does not permit states to use commercially available tests that
have not been augmented.

NCLBA provides Education a varied role with respect to these
assessments. Education is responsible for determining whether or not
states’ assessments comply with Title I requirements. States submit
evidence to Education showing that their systems for assessing students
and holding schools accountable meet Title I requirements, and Education
contracts with individuals who have expertise in assessments and Title I to
review this evidence. The experts provide Education with a report on the
status of each state regarding the degree to which a state’s system for
assessing students meets the requirements and, therefore, warrants
approval. Under NCLBA, Education can withhold federal funds provided
for state administration until Education determines that the state has
fulfilled those requirements.⁷ Education’s role also includes reporting to
Congress on states’ progress in developing and implementing academic
assessments, and providing states, at the state’s request, with technical
assistance in meeting the academic assessment requirements. It also
includes disseminating information to states on best practices.

States Generally Report Administering Statewide Assessments Developed to Measure Their State Standards, but Differ Along Other Characteristics

The majority of states report using statewide assessments developed to
measure student learning against the content they are taught in the state’s
schools, but their assessments differ in many other ways. For example,
some states use assessments that include multiple-choice questions, while
others include a mixture of multiple-choice questions and questions that
require students to write their answer by composing an essay or showing
how they calculated a math answer. In addition, some states make actual
test questions available to the public but differ with respect to the
percentage of test questions they publicly release. Nearly all states provide
accommodations for students with disabilities and some states report
offering their assessments in languages other than English. States also
vary in the number of new tests they will need to develop to comply with
the NCLBA.



⁷This amount is generally 1 percent of the amount that states receive under Title I or
$400,000, whichever is greater. 

The Majority of States Use Statewide Tests That They Report Are Written to Their State Standards



Forty-six states currently administer statewide tests to students and
44 plan to continue using statewide tests for future tests NCLBA requires
them to add.⁸ (See fig. 1.) Only 4 states (Idaho, Kansas, Pennsylvania, and
Nebraska) currently use a combination of state and local assessments, and
only Iowa currently uses all local assessments.



Figure 1: The Majority of States Report They Currently Use Statewide Tests and
Plan to Continue to Do So

[Charts for current and future tests; categories: statewide, local, combination, and don’t know/missing.]

Source: GAO survey.

Note: Percentages do not add to 100 because of rounding.



The majority of states (31) report that all of the tests they currently use 
consist of questions customized, that is, developed specifically to assess 
student progress against their state’s standards for learning for every grade 
and subject tested. (See fig. 2.) Many of the remaining states are using 
different types of tests for different grades and subjects. For example, 
some states are using customized tests for some grades and subjects and 



⁸The District of Columbia and Puerto Rico are included in our state totals.




GAO-03-389 Title I 



commercially available tests for other grades and subjects. Seven states
reported using only commercially available tests in all the grades and
subjects they tested.

In the future, the majority of states (33) report that all of their tests will
consist of customized questions for every subject and grade. Moreover, 
those states that currently use commercially available tests report plans to 
replace these tests with customized tests or augment commercially 
available tests with additional questions to measure what students are 
taught in schools, as required by NCLBA. 

Figure 2: The Majority of States Reported That They Currently Use and Plan to
Develop New Tests That Are Customized to Measure Their State’s Standards

[Chart of type of test; categories: customized test only, other, and don’t know/missing.]

Source: GAO survey.

Note: Percentages do not add to 100 due to rounding. In the current 
that reported using commercially available tests for all grades and subjects tested that 
augmented with additional questions to measure state standards. These states reported plans to 
augment these tests with additional questions or replace them with customized tests. 

States in Approach to Specific Accommodations

In developing their assessments, nearly all states (50) reported providing
specific accommodations for students with disabilities.⁹ These often
include Braille, large print, and audiotape versions of their assessments for
visually impaired students, as well as additional time and oral
administration.

About a quarter of the states (12) report offering these assessments in
languages other than English, typically Spanish. Both small and larger
states scattered across the United States offer assessments in languages
besides English. For example, small states such as Wyoming and Delaware and
large states such as Texas and New York offer Spanish language versions
of their assessments. New York and Minnesota offer their assessments in
as many as four other languages besides English.¹⁰ While a quarter of the
states currently translate or offer assessments in languages other than
English, additional states may provide other accommodations for students
with limited English proficiency, such as additional time to take the test,
use of bilingual dictionaries, or versions of the test that limit use of
idiomatic expressions.



States Using Different Types of Questions to Assess Students

Thirty-six states report they currently use a combination of multiple-choice
and a limited number of open-ended questions for at least some of
the assessments they give their students. (See fig. 3.) For example, in
Florida, third grade students’ math skills are assessed using multiple-choice
questions, while fifth grade students’ math skills are assessed using
a combination of multiple-choice and open-ended questions. Twelve states
reported having tests that consist entirely of multiple-choice questions.
For example, all of Georgia’s and Virginia’s tests are multiple-choice.
Almost half of the states reported that they had not made a decision about
the ratio of multiple-choice to open-ended questions on future tests. Of the
states that had made a decision, most reported plans to develop
assessments using the same types of questions they currently use.



⁹Two states reported that they did not provide accommodations for students with
disabilities at the state level; however, accommodations may have been provided at the
local school level.

¹⁰New York offers its assessments in Spanish, Korean, Haitian Creole, and Russian and
Minnesota offers its mathematics assessments in Spanish, Hmong, Somali, and Vietnamese.

Figure 3: The Majority of States Reported They Use a Combination of Multiple-choice
and Open-ended Questions on Their Tests, but Many States Are Uncertain
about Question Type on Future Tests

[Charts of question type for current and future tests; categories: mix of multiple-choice and written response, multiple-choice, don’t know, and missing.]

Source: GAO survey.



States choose to use a mixture of question types on their tests for varying 
reasons. For example, some officials believe that open-ended questions, 
requiring both short and long student responses, more effectively measure 
certain skills such as writing or math computation than multiple-choice 
questions. Further, they believe that different question types will render a 
more complete measure of student knowledge and skills. In addition, state 
laws sometimes require test designers to use more than one type of 
question. In Maine, for example, state law requires that all state and local 
assessments employ multiple measures of student performance. 

States Split as to Whether They Make Actual Test Questions Available to the Public Following Tests



Slightly over half of the states currently release actual test questions to the
public, but differ in the percentage of questions they release. (See fig. 4.)
Texas, Massachusetts, Maine, and Ohio release their entire tests to the
public following the tests, allowing parents and other interested parties to
see every question their children were asked. Other states, such as New
Jersey and Michigan, release only a portion of their tests. Moreover, even
those states that do not release questions to the general public may release
a portion of the questions to teachers, as does North Carolina, so that they
can better understand areas where students are having the most difficulty
and improve instruction. States that release questions must typically
replace them with new questions.



Figure 4: States Split in Decision to Release Test Questions to the Public Following
Tests

[Chart categories: do release, do not release, and don’t know/missing.]

Source: GAO survey.



Often, states periodically replenish their tests with new questions to 
improve test security. For example, states such as Florida, Kentucky,
Maryland, and South Carolina, which do not release test questions, replenish
or replace questions periodically.

In addition to replenishing test items, many states use more than one 
version for each of their tests and do so for various reasons. For example, 
Virginia gives a different version of its test to students who may have been
absent. Some states use multiple versions of their high school tests to
allow those students who do not pass them to take them multiple times. Still other
states, such as Massachusetts and Maine, use multiple versions to enable
the field testing of future test questions.

States Vary in the Number of Additional Tests They Reported They Need to Develop or Augment

States differ in the number of additional tests they reported they need to 
meet NCLBA requirements, with some having all of the tests needed while 
others will need to develop new tests or augment commercially available 
tests with additional questions to fulfill the new requirements for a total of 
17 tests. (See table 3.) Appendix III has information on the number of tests 
each state needs to develop or augment to comply with NCLBA. 

The majority of states (32) report they will need to develop or augment
9 or fewer tests and the rest (20) will need to develop or augment 10 or
more tests. Eight states (Alabama, New Mexico, Montana, South Dakota,
Idaho, West Virginia, Wisconsin, and the District of Columbia) report that
they need to develop or augment all 17 tests. Maryland is also replacing a
large number of its tests (15); although its assessments were certified as 
compliant with the 1994 law, the tests did not provide scores for individual 
students. Although Education waived the requirement that Maryland’s 
tests provide student level data, Maryland is in the process of replacing 
them so that it can provide such data, enabling parents to know how well 
their children are performing on state tests. 



Table 3: The Number of Tests States Reported Needing to Develop or Augment
Varies

Range in number of tests states
need to comply with NCLBA           Number of states
None                                 5
1-3                                  4
4-6                                  6
7-9                                 17
10-12                               10
13 or more                          10

Source: GAO survey.
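
The counts in table 3 can be cross-checked against the figures cited in the text (32 states needing 9 or fewer tests, 20 needing 10 or more). The short Python sketch below does that tally; the dictionary layout and variable names are illustrative assumptions, and the values are transcribed from the table.

```python
# Number of states in each range, transcribed from table 3.
# The dictionary keys are just labels chosen for this sketch.
table3 = {
    "None": 5,
    "1-3": 4,
    "4-6": 6,
    "7-9": 17,
    "10-12": 10,
    "13 or more": 10,
}

# States that need to develop or augment 9 or fewer tests.
nine_or_fewer = sum(table3[k] for k in ("None", "1-3", "4-6", "7-9"))

# States that need to develop or augment 10 or more tests.
ten_or_more = sum(table3[k] for k in ("10-12", "13 or more"))

# All survey respondents counted in the table.
total_respondents = sum(table3.values())

print(nine_or_fewer, ten_or_more, total_respondents)
```

The total of 52 respondents matches the survey population described in appendix I: the 50 states, the District of Columbia, and Puerto Rico.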



Most states reported plans to immediately begin developing the tests,
which, according to many of the assessment directors we spoke with,
typically take 2 to 3 years to develop. For example, most states reported
that by 2003 they will have developed or will begin developing the reading
and mathematics tests that must be administered by the 2005-06 school
year. Similarly, most states reported that by 2005 they will have developed
or will begin developing the science tests that must be administered by the
2007-08 school year.

To help them develop these tests, most states report using one or more 
outside contractors to help manage testing programs. Nearly all states 
report that developing, administering, scoring, and reporting will be a 
collaborative effort involving contractors and state and local education 
agencies. However, while states report that contractors and state 
education agencies will share the primary role in developing, scoring, and
reporting new assessments, local education agencies will have the primary 
role in administering the assessments. 



Estimates of Spending 
Driven Largely by 
Scoring Expenditures 



We provide three estimates ($1.9, $3.9, and $5.3 billion) of total state
spending between fiscal years 2002 and 2008 for test development, 
administration, scoring, and test reporting. These figures include 
estimated expenses for assessments states will need to add as well as 
continuing expenditures associated with assessments they currently have 
in place. The method of scoring largely explains the differences in the 
estimates. However, various other factors, such as the extent to which
states release assessment questions to the public after testing and
therefore need to replace them, also affect expenditures. Between states,
however, the number of students assessed will largely explain variation in
expenditures. Moreover, because expenditures for test development are
small in relation to test administration, scoring, and reporting 
(nondevelopment expenditures), we estimate that state expenditures may 
be lower in the first few years when states are developing their 
assessments and higher in subsequent years as states begin to administer 
and score them and report the results. 



Different Estimates 
Primarily Reflect 
Differences in How 
Assessments Are Scored 



We estimate that states may spend $1.9, $3.9, or $5.3 billion on
Title I assessments between fiscal years 2002 and 2008, with scoring
expenditures largely accounting for differences in our estimates.
Table 4 shows total state expenditures for the 17 tests required by
Title I. In appendix IV, we also provide separate estimates for expenses 
associated with the subset of the 17 assessments that states reported they 
did not have in place at the time of our survey but are newly required by 
NCLBA. 
Table 4: Estimated Expenditures by States for Title I Assessments,
Fiscal Years 2002-08

Question type                    Estimate       Questions and scoring methods used
Multiple-choice                  $1.9 billion   Estimate assumes that all states use machine-scored multiple-choice questions.
Current question type            $3.9 billion   Estimate assumes that states use the mix of question types reported in our survey.
Multiple-choice and open-ended   $5.3 billion   Estimate assumes that all states use both machine-scored multiple-choice questions and some hand-scored open-ended questions.

Source: GAO projections based on state assessment plans and characteristics and expenditure data gathered from 7 states.



The $1.9 billion estimate assumes that all states will use multiple-choice 
questions on their assessments. Multiple-choice questions can be scored 
by scanning machines, making them relatively inexpensive to score. For 
instance, North Carolina, which uses multiple-choice questions on all of its
assessments and machine scores them, spends approximately 
$0.60 to score each assessment. 

The $3.9 billion estimate assumes that states will implement assessments
with questions like the ones they currently use or plan to use based on
state education agency officials’ responses to our survey. However,
25 states reported that they had not made final decisions about question
type for future assessments. Thus, the types of questions states ultimately
use may be different from the assessments they currently use or plan to
use.

Finally, the $5.3 billion estimate assumes that all states will implement
assessments with both multiple-choice and open-ended questions.
Answers to open-ended questions, where students write out their
responses, are typically read and scored by people rather than by
machines, making them much more expensive to score than answers to
multiple-choice questions. We found that states using open-ended
questions had much higher scoring expenditures per student than states
using multiple-choice questions, as evidenced in the states we visited and
shown in figure 5.* For example, Massachusetts, which uses many
open-ended questions on its Title I assessments, spends about
$7.00 to score each assessment. Scoring students’ answers to open-ended
questions in Massachusetts involves selecting and training people to read
and score the answers, assigning other people to supervise the readers,
and providing a facility where the scoring can take place. In cases where
graduation decisions depend in part on a student’s score on the
assessment, the state requires that two or three individuals read and score
the student’s answer. By using more than one reader to score answers,
officials ensure consistency between scorers and are able to resolve
disagreements about how well the student performed.

*In Texas and Colorado, we were unable to separate scoring expenditures from other
types of expenditures.



Figure 5: Estimated Scoring Expenditures Per Assessment Taken for Selected
States, Fiscal Year 2002

[Bar chart. Vertical axis: estimated scoring expenditures per assessment taken. Legend: open-ended and multiple-choice; primarily multiple-choice.]

Source: GAO analysis of expenditure data provided by state education agencies.



We estimate that, for most states, much of the expense associated with
assessments will be related to test scoring, administration, and reporting,
not test development, which includes such expenses as question
development and field testing.* (See table 5.) In Colorado, for example,
test administration, scoring, and reporting expenditures comprised
89 percent of the total expenditures, while test development expenditures
comprised only 11 percent. (See app. V for our estimates of development
and nondevelopment expenditures by state.)

*This may not be true for smaller states because they may have fewer assessments to
administer, score, and report.



Table 5: Estimated Total Expenditures for Test Development Are Lower Than for
Test Administration, Scoring, and Reporting

In millions

                                         Multiple-   Current          Multiple-choice
                                         choice      question type    and open-ended
Development                              $668        $706             $724
Administration, scoring, and reporting   1,233       3,237            4,590
Total                                    $1,901      $3,944           $5,313

Source: GAO projections based on state assessment plans and characteristics and expenditure data gathered from 7 states.
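
To make the relationship in table 5 concrete, the following Python sketch computes the share of each estimate that goes to test development. The figures are transcribed from the table; the dictionary keys and the function name are illustrative assumptions, not part of GAO's method.

```python
# Figures in millions of dollars, transcribed from table 5.
table5 = {
    "multiple_choice": {"development": 668, "nondevelopment": 1233},
    "current_question_type": {"development": 706, "nondevelopment": 3237},
    "multiple_choice_and_open_ended": {"development": 724, "nondevelopment": 4590},
}


def development_share(row):
    """Return development spending as a percentage of the row's total."""
    total = row["development"] + row["nondevelopment"]
    return 100.0 * row["development"] / total


# Percentage of each estimate attributable to development, to one decimal.
shares = {name: round(development_share(row), 1) for name, row in table5.items()}

for name, share in shares.items():
    print(f"{name}: development is about {share} percent of the total")
```

Development is roughly a third of the multiple-choice estimate but well under a fifth of the other two, because development spending barely changes across scenarios while scoring-related spending grows. The summed components differ from two of the published totals by $1 million (3,943 versus 3,944 and 5,314 versus 5,313), which is consistent with rounding in the published figures.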



Various Factors are Likely 
to Affect Expenditures for 
Title I Assessments 



While the scoring method explains a great deal of the variation in 
expenditures among states, other factors are likely to affect expenditures. 
These factors include the number of different test versions used, the 
extent to which the state releases assessment questions to the public after 
testing, fees for using copyrighted material, and factors unique to the state. 
(See fig. 6.) For example, states that use multiple test versions will have 
higher expenditures than those that have one. Massachusetts used 
24 different test versions for many of its assessments and spent 
approximately $200,000 to develop each assessment. Texas used only 
1 version for its assessments and spent approximately $60,000 per 
assessment. In addition, states that release test items to the public or 
require rapid reporting of student test scores are likely to have higher 
expenditures than states that do not because they need to replace these 
items with new ones to protect the integrity of the tests and assign 
additional staff to more rapidly score the assessments by the specified 
time frame. States that customize their assessments may have higher 
expenditures than states that augment commercially available tests. 
Moreover, factors unique to the state may affect expenditures. Maine, 
which had one of the lowest assessment development expenses of all of 
the states we visited (about $22,000 per assessment), has a contract with a 
nonprofit testing company. 



Between states, the number of students tested generally explains much of 
the variation in expenditures, particularly when question types are similar. 
States with large numbers of students tested will generally have higher 
expenditures than states with fewer students. 




GAO-03-389 Title I 



Figure 6: Various Factors Are Likely to Affect What States Spend on Title I
Assessments

[Two-column chart. First column: factor. Second column: likely effect on estimated expenditures, shown with symbols that are not recoverable here.]

Factors listed:
Number of students taking assessments
Customizing assessments to align with state standards
Extent of public release of questions
Number of different test versions
Faster turnaround time for scoring
Factors unique to the state

Source: State education agency official interviews.



Benchmark Amounts in 
NCLBA Will Cover Varying 
Portions of States’ 
Estimated Expenditures 
and Amount Covered Will 
Vary Primarily by Type of 
Test Questions States Use 



Using the benchmark funding levels specified in NCLBA, we estimate that 
these amounts would cover varying portions of estimated expenditures. 
(See table 6.) In general, these benchmark amounts would cover a larger 
percentage of the estimated expenditures for states that choose to use 
multiple-choice tests. To illustrate, we estimated that Alabama would
spend $30 million if it continued to use primarily multiple-choice
questions, but $73 million if the state used assessments with both
multiple-choice and open-ended questions. The specified amount would cover
151 percent of Alabama’s estimated expenditures if it chose to use all 
multiple-choice questions, but 62 percent if the state chose to use both 
multiple-choice and open-ended questions. 
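
The coverage percentages can be approximated from the rounded figures in table 6 as the benchmark amount divided by the estimated expenditure. The Python sketch below applies that ratio to Alabama; the function name is an illustrative assumption, and the $46 million benchmark is the value listed for Alabama in table 6.

```python
def benchmark_coverage(benchmark_millions, estimate_millions):
    """Return the benchmark as a percentage of estimated expenditures."""
    return 100.0 * benchmark_millions / estimate_millions


# Alabama, in millions of dollars (rounded values from table 6).
alabama_benchmark = 46
alabama_multiple_choice = 30
alabama_multiple_choice_and_open_ended = 73

coverage_mc = benchmark_coverage(alabama_benchmark, alabama_multiple_choice)
coverage_both = benchmark_coverage(
    alabama_benchmark, alabama_multiple_choice_and_open_ended
)

print(f"Multiple-choice: about {coverage_mc:.0f} percent")
print(f"Multiple-choice and open-ended: about {coverage_both:.0f} percent")
```

The results (about 153 and 63 percent) are close to, but not identical with, the 151 and 62 percent reported above. The gap is presumably because the published percentages were computed from unrounded dollar amounts, which are not available in the table.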
Table 6: Total Estimated Expenditures by States for Title I Assessments, Fiscal Years 2002-08

Column key: MC = multiple-choice; Current = current question type; MC+OE = multiple-choice and open-ended.
Estimates and the appropriation benchmark are in millions of dollars.

                        Estimates (in millions)    Appropriation   Appropriation benchmark as
                                                   benchmark       percent of estimated expenses
State                   MC     Current   MC+OE     (in millions)   MC      Current   MC+OE
Alabama                 $30    $30       $73       $46             151%    151%      62%
Alaska                  17     25        28        26              154     106       93
Arizona                 39     108       108       51              132     47        47
Arkansas                23     42        53        37              158     88        70
California              178    235       632       219             123     93        35
Colorado                32     87        87        46              145     53        53
Connecticut             28     68        68        41              147     59        59
Delaware                14     24        24        26              183     106       106
District of Columbia    13     13        17        24              184     184       144
Florida                 83     211       281       102             123     48        36
Georgia                 54     54        174       67              124     124       39
Hawaii                  17     31        31        28              162     91        91
Idaho                   18     23        30        30              167     131       98
Illinois                65     164       211       92              141     56        44
Indiana                 40     113       113       56              140     49        49
Iowa                    24     62        62        38              15[?]   62        62
Kansas                  23     36        51        38              164     106       73
Kentucky                28     62        71        43              155     70        61
Louisiana               31     81        81        49              158     60        60
Maine                   18     33        33        29              159     86        86
Maryland                35     91        91        51              146     56        56
Massachusetts           38     109       109       55              144     50        50
Michigan                57     177       177       80              140     45        45
Minnesota               34     91        91        51              149     56        56
Mississippi             25     63        63        39              154     61        61
Missouri                36     99        99        54              150     54        54
Montana                 18     28        29        27              149     97        93
Nebraska                18     34        34        32              177     93        93
Nevada                  21     26        45        33              152     125       72
New Hampshire           17     32        32        29              168     92        92
New Jersey              43     127       127       67              153     53        53
New Mexico              21     39        41        33              155     84        81
New York                83     276       276       121             146     44        44
North Carolina          49     49        152       65              132     132       43
North Dakota            16     23        23        26              162     109       109
Ohio                    55     171       171       86              158     50        50
Oklahoma                27     37        66        42              156     114       63

[The remaining rows of the table are not recoverable. The surviving fragments are the state names West Virginia and Wisconsin and the values 118, 55, 135, 43, 31, 135, 72, 53, and 180, which cannot be assigned to rows or columns. The last digit of Iowa's multiple-choice percentage is illegible.]




Total Expenditures Likely 
to Be Lower in the First 
Few Years, Increasing Over 
Time as States Begin to 
Administer, Score, and 
Report Additional 
Assessments 



[...] to estimate spending for fiscal year [...] In addition, because we were mandated
[...] numbers of tests are administered, scored, and reported. As a result, the
benchmark funding amounts in NCLBA would cover a larger percentage of
estimated expenditures in the first few years. Under some [...],
the funding benchmarks in NCLBA exceed estimated state expenditures.
[...] allocation would
more than cover all of the estimated expenses if all states were to use
[...] continue with the types of questions they
currently use. If all states were to choose to use a mixture of multiple-choice
and open-ended questions, the most expensive option, fiscal year
2002 funding would cover 84 percent of states’ total expenditures. We
[...] (See app. [...] for fiscal year
2002 through 2008 estimated expenditures for each question type.)

[...] benchmark funding would continue to cover
all of the estimated expenditures if all states were to use all multiple-
choice questions, about two-thirds of estimated expenditures if all states
continued using their current mix of questions, and a little over 50 percent
of estimated expenditures if all states were to use a mixture of question
types, the most expensive option.



Benchmark Funding in NCLBA Estimated to Cover [...]

[Chart. Vertical axis: dollars in millions. Horizontal axis: fiscal year, 2002 through 2008. Series: benchmark appropriations; multiple-choice; current question type; multiple-choice and open-ended.]

Source: GAO analysis.



Opportunities May Exist to 
Share Information on 
Efforts to Reduce Testing 
Expenditures 



Some states are exploring ways to control expenses related to 
assessments and their experiences may provide useful information to 
other states about the value of various methods for controlling 
expenditures. Recently, several states, in conjunction with testing industry
representatives, met to discuss ways of reducing test expenditures. For
example, the group discussed a range of possible options for reducing
expenditures, including computer-administered tests; commercially
available tests that can be customized to states’ standards by adding
additional questions; computerized scoring of written responses; and
computer scanning of students’ written responses. Information about
individual states’ experiences as they attempt to reduce expenses could
benefit other states. However, such information is currently not 
systematically shared. 


Conclusions 


The 1994 and 2001 ESEA reauthorizations raised student assessments to a
new level of importance. These assessments are intended to help ensure 
that all students are meeting state standards. Congress has authorized 
funding to assist states in developing and implementing these assessments. 
We estimate that federal funding benchmarks in NCLBA will cover a larger 
percentage of expenses in the first few years when states are developing 
their assessments, with the covered percentage decreasing as states begin 
to administer, score, and report the full complement of assessments.
Moreover, the choices states make about how they will assess students 
will influence expenditures. Some states are investigating ways to reduce 
the expenses, but currently information on states’ experiences in 
attempting to reduce expenses is not broadly shared. We believe states 
could benefit from information sharing. 


Recommendation 


Given the large federal investment in testing and the potential for reducing 
test expenditures, we recommend that Education use its existing 
mechanisms to facilitate the sharing of information on states’ experiences 
as they attempt to reduce expenses. 


Agency Comments 


The Department of Education provided written comments on a draft of
this report, which we have summarized below and incorporated in the 
report as appropriate. (See app. VII for agency comments.) Education 
agreed with our recommendation, stating that it looks forward to 
continuing and enhancing its efforts to facilitate information sharing that 
might help states contain expenses. However, Education raised concerns 
about our methodology, noted the availability of additional federal 
resources under ESEA that might support states’ assessment efforts, and 
pointed out that not all state assessment costs are generated by NCLBA. 

With regard to our estimates, we have confidence that our methodology is 
reasonable and provides results that fairly represent potential 
expenditures based on the best available information. Education’s 
comments focus on the uncertainties that are inherent in estimation of any
kind (the necessity of assumptions, the possibility of events or trends not
readily predicted, and other potential sources of error that are
acknowledged in the report) without proposing an alternative
methodology. Because of the uncertainty, we produced three estimates
instead of one. In developing our approach, we solicited comments from
experts in the area and incorporated their suggestions as appropriate. We 
also discussed our estimation procedures with Education staff, who raised 
no significant concerns. Second, Education cites various other sources of 
funds that states might use to finance assessments. While other sources 
may be available, we focused primarily on the amounts specifically 
authorized for assessments in order to facilitate their comparison to 
estimated expenses and because they are the minimum amounts that 
Congress must appropriate to ensure that states continue to develop as 
well as implement the required assessments. 



We are sending copies of this report to the Secretary of Education, 
relevant congressional committees, and other interested parties. Please 
contact me on (202) 512-7215 or Betty Ward-Zukerman on (202) 512-2732 if 
you or your staff have any questions about this report. In addition, the 
report will be available at no charge on GAO’s Web site at 
http://www.gao.gov. Other GAO contacts and staff acknowledgments are
listed in appendix VIII. 




Marnie S. Shaul, Director
Education, Workforce 
and Income Security Issues 






Appendix I: Objectives, Scope, and 
Methodology 



The objectives of this study were to provide information on the basic
characteristics of Title I assessments, to estimate what states would
likely spend on Title I assessments between fiscal year 2002 and 2008, and
to identify factors that explain variation in estimated expenditures. To
address the first objective, we collected information from a survey sent to
the 50 states, the District of Columbia, and Puerto Rico, and reviewed
documentation from state education agencies and from published studies
detailing the characteristics of states’ assessments. To address the second
objective, we collected detailed assessment expenditure information from
7 states, interviewed officials at state education agencies, discussed cost
factors with assessment contractors, and estimated assessment
expenditures under three different scenarios. The methods we used to
address the objectives were reviewed by several external reviewers, and 
we incorporated their comments as appropriate. This appendix discusses 
the scope of the study, the survey, and the methods we used to estimate 
assessment expenditures. 

Providing Information on 
the Basic Characteristics 
of Title I Assessments 



We surveyed all 50 states, the District of Columbia, and Puerto Rico, all of 
which responded to our survey. We asked them to provide information 
about their Title I assessments, including the characteristics of current and 
planned assessments, the number and types of new tests they needed to 
develop to satisfy No Child Left Behind Act (NCLBA) requirements, when 
they planned to begin developing the new assessments, the types of 
questions on their assessments, and their use of contractors. We also 
reviewed documentation from several states about their assessment 
programs and published studies detailing the characteristics of states’ 
assessments. 



Estimating Assessment 
Expenditures and 
Explaining Variation in the 
Estimates 



This study estimates likely expenditures on Title I assessments by states
between fiscal year 2002 and 2008, and identifies factors that may explain
variation in the estimates. It does not estimate expenditures for alternate
assessments for students with disabilities, for English language
proficiency testing, or expenditures incurred by school districts.1 Instead,
we estimated expenses states are expected to incur based on expenditure
data obtained for this purpose from 7 states combined with data on these
and other states’ assessment plans and characteristics obtained through a
survey.2 In the 7 states, we requested information and documentation on
expenditures in a standard set of areas, met with state officials to discuss
the information, and asked that they review our subsequent analysis of
information regarding their state. The expenditure data that we received
from the 7 states were not audited. Moreover, actual expenditures may
vary from projected amounts, particularly when events or circumstances
are different from those assumed, such as changes in the competitiveness
of the market for student assessment or changes in assessment
technology.

1The study also does not estimate the opportunity costs of assessments.



Selection of 7 States

We selected 7 states that had assessments in place in many of the grades
and subjects required by the NCLBA from the 17 states with assessment
systems that had been certified by Education as in compliance with
requirements of the Improving America’s Schools Act of 1994 when we
began our work. We included states with varying student enrollments,
including 2 states with relatively small numbers of students. The states we
selected were Colorado, Delaware, Maine, Massachusetts, North Carolina,
Texas, and Virginia. (See table 7 for information about the selected states.)



Table 7: States Selected for Study

                                                     Number of assessments
                 Date approved     Number of    Reading      Math         Science
State            by Education      students     (out of 7)   (out of 7)   (out of 3)   Total
Colorado         July 2001           724,508    7            5            1            13
Delaware         December 2000       114,676    7            7            3            17
Maine            February 2002       207,037    3            3            3             9
Massachusetts    January 2001        975,150    5            4            3            12
North Carolina   June 2001         1,293,638    7            7            0            14
Texas            March 2001        4,059,619    7            7            2            16
Virginia         January 2001      1,144,915    4            4            3            11

Source: U.S. Department of Education, National Center for Education Statistics, and state education agencies.



Collection of Expenditure
Information from 7 States

We collected detailed assessment expenditure information from officials
in the 7 states. We obtained actual expenditures on contracts and state
assessment office budget expenditures for fiscal year 2002 for all 7 states
and for previous years in 4 states.3 In site visits to the 7 states, we
interviewed state education agency officials who explained various
elements of their contracts with assessment publishing firms and the
budget for the state’s assessment office. To the extent possible, we
collected expenditure data, distinguishing expenditures for assessment
development from expenditures for assessment administration, scoring,
and reporting, because expenditures vary differently between these two
expenditure categories. Assessment development expenditures vary with
the number of assessments while administration, scoring, and reporting
expenditures vary with the number of students taking the assessments.
(See table 8 for examples of expenditures.)

2Because our expenditure data were limited to 7 states, our estimates may be biased. For
example, if the 7 states we selected had higher average development expenditures per
ongoing assessment than the average state, then our estimate of development expenditures
would be biased upwards.



Table 8: Examples of Assessment Expenditures

Type of expenditure   Example of expenditure
Development           Question writing
                      Question review (e.g., for bias)
Administration        Printing and delivering assessment booklets
Scoring               Scanning completed booklets into scoring machines
Reporting             Producing individual score reports

Source: State education agencies.



Calculation of Averages for 
Development and for 
Administration, Scoring, 
and Reporting 



Using annual assessment expenditures for all 7 states, the number of 
assessments developed and implemented, and the number of students who 
took the assessments, we calculated average expenditures for ongoing 
development (assessments past their second year of development) and 
average expenditures for administration, scoring, and reporting for each 
state. (See table 9.) 



3We were unable to obtain information on personnel expenditures from 5 of the 7 states,
and so we did not include personnel expenditures in our analysis. In the 2 states in which 
we obtained personnel expenditures, such expenditures were a relatively small part of the 
assessment budget. 
Table 9: Average Annual Expenditures for the 7 States (adjusted to 2003 dollars)

                                         Average expenditures for administration,
                 Average development     scoring, and reporting (per assessment taken)
                 expenditures (per       Both multiple-choice and   Multiple-choice
State            ongoing assessment)     open-ended questions       questions
Colorado         $72,889                 $10.35
Delaware         $66,592                 $8.78
Maine            $22,295                 $9.96
Massachusetts    $190,870                $12.45
North Carolina   $104,181                                           $1.85
Texas            $61,453                                            $4.72
Virginia         $78,489                                            $1.80

Source: GAO analysis of state education agency information.

Note: We were able to obtain data for more than 1 year for Colorado, Delaware, Maine,
Massachusetts, and Texas. For these states, we adjusted their average expenditures to 2003 dollars
and then averaged these adjusted expenditures across the years that data were collected. North
Carolina did not distinguish Title I assessments from other assessments it offers.



Estimating States’ Likely
Expenditures for 17 Title I
Assessments

We provide three estimates of what all states are likely to spend on all of
the required 17 assessments using the average development expenditure
and average expenditures for administration, scoring, and reporting by
question type (multiple-choice or multiple-choice with some open-ended
questions). One estimate assumes that all states use only multiple-choice
questions, the second assumes that states will use the types of questions
state officials reported they use or planned to use, and the third assumes
that all states will use both multiple-choice and a limited number of long
and short open-ended questions. All estimates reflect states’ timing of their
assessments (for example, that science assessments are generally planned
to be developed and administered later than assessments for reading and
mathematics).

To estimate what states would spend under the assumption that they use
only multiple-choice questions, we took the mean of the average annual
expenditures per assessment for North Carolina, Texas, and Virginia,
states that use multiple-choice assessments. To compute an estimate that
reflected the types of questions states used or planned to use, we used the
appropriate averages. To illustrate, California reported 15 multiple-choice
tests and 2 tests that include a combination of multiple-choice and open-
ended questions. For the 15 multiple-choice tests, we used the mean from
the multiple-choice states (North Carolina, Texas, and Virginia). For the
2 multiple-choice and open-ended tests, we used the mean from the states
that had both question types (Colorado, Delaware, Maine, and
Massachusetts). To estimate what states would spend, assuming that all
states use both multiple-choice and open-ended questions, we used the
mean of the average annual expenditures for Colorado, Delaware, Maine,
and Massachusetts, states that use both types of questions.
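
The grouping described above can be sketched in Python using the per-state averages from table 9. This is a minimal illustration that assumes the group means are simple unweighted averages across states, which the text suggests but does not state outright; the data structure and names are hypothetical.

```python
from statistics import mean

# Per-state averages from table 9 (2003 dollars):
# (question type, development $ per ongoing assessment,
#  administration/scoring/reporting $ per assessment taken)
TABLE_9 = {
    "Colorado": ("both", 72889, 10.35),
    "Delaware": ("both", 66592, 8.78),
    "Maine": ("both", 22295, 9.96),
    "Massachusetts": ("both", 190870, 12.45),
    "North Carolina": ("multiple_choice", 104181, 1.85),
    "Texas": ("multiple_choice", 61453, 4.72),
    "Virginia": ("multiple_choice", 78489, 1.80),
}


def group_means(question_type):
    """Unweighted mean development and nondevelopment averages for one group."""
    rows = [row for row in TABLE_9.values() if row[0] == question_type]
    return mean(row[1] for row in rows), mean(row[2] for row in rows)


mc_dev, mc_admin = group_means("multiple_choice")
both_dev, both_admin = group_means("both")

# California illustration from the text: 15 multiple-choice tests use the
# multiple-choice mean and 2 mixed tests use the mean for states with both types.
california_admin_rates = [mc_admin] * 15 + [both_admin] * 2

print(f"Multiple-choice: ${mc_dev:,.0f} per assessment, ${mc_admin:.2f} per test taken")
print(f"Both types:      ${both_dev:,.0f} per assessment, ${both_admin:.2f} per test taken")
```

Under this assumption, the per-test-taken average is several times higher for states using both question types than for multiple-choice states, while the development averages are much closer together. That is in line with the report's finding that scoring method drives most of the difference among the three estimates.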



Estimating Development
Expenditures

To estimate development expenditures, we obtained information from
each state regarding the number of assessments it needed to develop, the
year in which it planned to begin development of each new assessment,
and the number of assessments it already had. For each assessment the
state indicated it needed to develop, we estimated initial development
expenditures beginning in the year the state said it would begin
development and also for the following year because interviews with
officials revealed that developing an entirely new assessment takes
approximately 2 to 3 years. For the 7 states that provided data, we were
typically not able to separate expenditures for new test development from
expenditures for ongoing test development. Where such data were
available, we determined that development expenses for new assessments
were approximately three times the development expenses for
ongoing assessments, and we used that approximation in our estimates.
For each state each year, we multiplied the number of tests in initial
development by three times the average ongoing development expenditure
to reflect that initial development of assessments is more expensive than
ongoing development.4 We multiplied the number of ongoing tests by the
average ongoing development expenditure. The sum of these two products
provides a development expenditure for each state in each year and
provides a total development estimate. We calculated three estimates as
follows:

• using the expenditure information from states that use multiple-choice 
questions, we produced a lower estimate; 

• using the information from the state survey on the types of tests they 
planned to develop (some states indicated tests with both open-ended and 
multiple-choice questions and some indicated multiple-choice only), we 
produced a middle estimate; and 

• using the expenditure information from the states that use open-ended and 
multiple-choice questions, we produced the higher estimate. 
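
The development calculation described above can be sketched as follows. This is an illustration under stated assumptions: the average ongoing development expenditure and the state's plan are hypothetical values, and the function name is introduced here for the example. The factor of three for new assessments is the approximation the text describes.

```python
NEW_TEST_MULTIPLIER = 3  # new development costs about 3x ongoing, per the text


def development_expenditure(new_tests, ongoing_tests, avg_ongoing_cost):
    """Development expenditure for one state in one year."""
    new_cost = new_tests * NEW_TEST_MULTIPLIER * avg_ongoing_cost
    ongoing_cost = ongoing_tests * avg_ongoing_cost
    return new_cost + ongoing_cost


# HYPOTHETICAL average ongoing development expenditure per test, in dollars.
AVG_ONGOING_COST = 100_000

# HYPOTHETICAL plan for one state: year -> (tests in initial development,
# ongoing tests). The two new tests are counted in the year development
# begins and in the following year, then become ongoing tests.
plan = {
    2003: (2, 6),
    2004: (2, 6),
    2005: (0, 8),
}

by_year = {
    year: development_expenditure(new, ongoing, AVG_ONGOING_COST)
    for year, (new, ongoing) in plan.items()
}
state_total = sum(by_year.values())

print(by_year)
print(state_total)
```

Swapping `AVG_ONGOING_COST` between a multiple-choice average and a combination average, or choosing it per test from survey responses, would give the lower, higher, and middle estimates respectively.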



Note: We found estimates were not sensitive to changes in assumptions regarding development 
costs, partly because they proved to be a generally small portion of overall expenses. 

Note: For states that reported that they did not know the kinds of questions they would use on 
future tests, we assumed that future tests would be the same as those they currently use. Where 
data were missing, we assumed that states would use assessments with both multiple-choice 
and open-ended questions, potentially biasing our estimates upward. 
Estimating Administration, Scoring, and Reporting Expenditures 



To produce an estimate for administration, scoring, and reporting, we used 
three variables: the average number of students in a grade; the number of 
administered assessments; and the average administration, scoring, and 
reporting expenditure per assessment taken. We calculated the average 
number of students in a grade in each year using data from the National 
Center for Education Statistics’ Common Core of Data for 2000-01 and 
its Projections of Education Statistics to 2011. We obtained data on the 
number of administered assessments from our state education agency 
survey. Data on average expenditures come from the states in which we 
collected detailed expenditure information. 

For each state in each year, we multiplied the average number of students 
in a grade by the number of administered assessments and by the 
appropriate average assessment expenditure. Summing over states and 
years provided a total estimate for administration, scoring, and reporting. 
As above, we performed these calculations, using the expenditure 
information from multiple-choice states to produce the lower estimate, 
using the information from the state survey and expenditure information 
from both combination and multiple-choice states to produce a middle 
estimate, and using the expenditure information from the combination 
states to produce the higher estimate. We also estimated what states are 
likely to spend on the assessments that states did not have in place at the 
time of our survey, but are required by NCLBA, using the same basic 
methodology. Table 10 provides an overview of our approach to estimating 
states’ likely expenditures on Title I assessments. 
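
The administration, scoring, and reporting calculation can be sketched in the same way. Every number below is a hypothetical placeholder, and the state names, data layout, and function name are assumptions made for the example. Only the three-variable product and the summation over states and years follow the description above.

```python
def asr_expenditure(students_per_grade, assessments, cost_per_assessment_taken):
    """Administration, scoring, and reporting expenditure for one state-year."""
    return students_per_grade * assessments * cost_per_assessment_taken


# HYPOTHETICAL average expenditure per assessment taken, in dollars.
COST_PER_ASSESSMENT_TAKEN = 10

# HYPOTHETICAL inputs: state -> year -> (average students in a grade,
# number of administered assessments).
inputs = {
    "State A": {2005: (50_000, 17), 2006: (51_000, 17)},
    "State B": {2005: (20_000, 9), 2006: (20_000, 17)},
}

# Sum over states and years to obtain the total estimate.
total = sum(
    asr_expenditure(students, tests, COST_PER_ASSESSMENT_TAKEN)
    for years in inputs.values()
    for students, tests in years.values()
)

print(total)
```

Because the per-assessment expenditure multiplies every student and every test, this component is far more sensitive to the choice of question type than the development component.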



Table 10: Estimated Expenditures to Implement Title I Assessments in a Given Year 

A   Total estimated development expenditure for ongoing assessments 
    = Number of ongoing assessments 
      x Average development expenditure for each ongoing assessment 

B   Total estimated development expenditure for new assessments 
    = Number of new assessments 
      x Three times the average development expenditure for each ongoing 
        assessment 

C   Total estimated expenditures for administration, scoring, and 
    reporting (ongoing and new assessments) 
    = Average number of students in each grade 
      x Average administration, scoring, and reporting expenditure for 
        each assessment taken, times the number of assessments 
        administered, for each ongoing and new assessment 

A + B + C = States’ estimated expenditures to implement Title I assessments 



Source: GAO analysis. 



We conducted our work in accordance with generally accepted 
government auditing standards between April 2002 and March 2003. 
Requirements for 1994 and Requirements for 2001 


Developing standards for content and performance 

1994: Develop challenging standards for what students should know in 
mathematics and reading or language arts. In addition, for each of 
these standards, states should develop performance standards 
representing three levels: partially proficient, proficient, and 
advanced. The standards must be the same for all children. If the 
state does not have standards for all children, it must develop 
standards for Title I children that incorporate the same skills, 
knowledge, and performance expected of other children. 

2001: In addition, develop standards for science content by 2005-06. 
The same standards must be used for all children. 


Implementing and administering assessments 

1994: Develop and implement assessments aligned with the content and 
performance standards in at least mathematics and reading or 
language arts. 

2001: Add assessments aligned with the content and performance 
standards in science by the 2007-08 school year. These 
science assessments must be administered at some time in 
each of the following grade ranges: grades 3 through 5, 6 
through 9, and 10 through 12. 

1994: Use the same assessment system to measure Title I students as the 
state uses to measure the performance of all other students. In the 
absence of a state system, a system that meets Title I requirements 
must be developed for use in all Title I schools. 

2001: Use the same assessment system to measure Title I students 
as the state uses to measure the performance of all other 
students. If the state provides evidence to the Secretary that it 
lacks authority to adopt a statewide system, it may meet the 
Title I requirement by adopting an assessment system on a 
statewide basis and limiting its applicability to Title I students or 
by ensuring that the Title I local educational agency (LEA) 
adopts standards and aligned assessments. 

1994: Include in the assessment system multiple measures of student 
performance, including measures that assess higher order thinking 
skills and understanding. 

2001: Unchanged 

1994: Administer assessments for mathematics and reading in each of the 
following grade spans: grades 3 through 5, 6 through 9, and 10 
through 12. 

2001: Administer reading and mathematics tests annually in grades 
3 through 8, starting in the 2005-06 school year (in addition to 
the assessments previously required sometime within grades 
10 through 12). 

States do not have to administer mathematics and reading or 
language arts tests annually in grades 3 through 8 if Congress 
does not provide specified amounts of funds to do so, but 
states have to continue to work on the development of the 
standards and assessments for those grades. 

Have students in grades 4 and 8 take the National Assessment 
of Educational Progress examinations in reading and 
mathematics every other year beginning in 2002-03, as long as 
the federal government pays for it. 

1994: Assess students with either or both criterion-referenced assessments 
and assessments that yield national norms. However, if the state 
uses only assessments referenced against national norms at a 
particular grade, those assessments must be augmented with 
additional items as necessary to accurately measure the depth and 
breadth of the state’s academic content standards. 

2001: Unchanged 

1994: Assess students with statewide, local, or a combination of state and 
local assessments. However, states that use all local or a 
combination of state and local assessments must ensure, among 
other things, that such assessments are aligned with the state’s 
academic content standards, are equivalent to one another, and 
enable aggregation to determine whether the state has made 
adequate yearly progress. 

2001: Unchanged 

1994: Implement controls to ensure the quality of the data collected from 
the assessments. 

2001: Unchanged 


Including students with limited English proficiency and with 
disabilities in assessments 

1994: Assess students with disabilities and limited English proficiency 
according to standards for all other students. 

Provide reasonable adaptations and accommodations for students 
with disabilities or limited English proficiency to include testing in the 
language and form most likely to yield accurate and reliable 
information on what they know and can do. 

2001: By 2002-03 annually assess the language proficiency of 
students with limited English proficiency. Students who have 
attended a U.S. school for 3 consecutive years must be tested 
in English unless an individual assessment by the district 
shows testing in a native language will be more reliable. 


Reporting data 

1994: Report assessment results according to the following: by state, local 
educational agency (LEA), school, gender, major racial and ethnic 
groups, English proficiency, migrant status, disability, and economic 
disadvantage. 

2001: Unchanged. 

1994: LEAs must produce for each Title I school a performance profile with 
disaggregated results and must publicize and disseminate these to 
teachers, parents, students, and the community. LEAs must also 
provide individual student reports, including test scores and other 
information on the attainment of student performance standards. 

2001: Provide annual information on the test performance of 
individual students and other indicators included in the state 
accountability system by 2002-03. Make this annual 
information available to parents and the public and include data 
on teacher qualifications. Compare high- and low-poverty 
schools with respect to the percentage of classes taught by 
teachers who are “highly qualified,” as defined in the law, and 
conduct similar analyses for subgroups listed in previous law. 


Measuring improvement 

1994: Use performance standards to establish a benchmark for 
improvement referred to as “adequate yearly progress.” All LEAs and 
schools must meet the state’s adequate yearly progress standard, 
for example, having 90 percent of their students performing at the 
proficient level in mathematics. LEAs and schools must show 
continuous progress toward meeting the adequate yearly progress 
standard. The state defines the level of progress a school or LEA 
must show. Schools that do not make the required advancement 
toward the adequate yearly progress standard can face 
consequences, such as the replacement of the existing staff. 

2001: In addition to showing gains in the academic achievement of 
the overall school population, schools and districts must show 
that the following subcategories of students have made gains 
in their academic achievement: pupils who are economically 
disadvantaged, have limited English proficiency, are disabled, 
or belong to a major racial or ethnic group. To demonstrate 
gains among these subcategories of students, school districts 
measure their progress against the state’s definition of 
adequate yearly progress. 

States have 12 years for all students to perform at the 
proficient level. 


Consequences for not meeting the adequate yearly progress standard 

1994: LEAs are required to identify for improvement any schools that fail to 
make adequate yearly progress for 2 consecutive years and provide 
technical assistance to help failing schools develop and implement 
required improvement plans. After a school has failed to meet the 
adequate yearly progress standard for 3 consecutive years, LEAs 
must take corrective action to improve the school. 

2001: New requirements are more specific as to what actions an LEA 
must take to improve failing schools. Actions are defined for 
each year the school continues to fail leading up to the 5th year 
of failure when a school may be restructured by changing to a 
charter school, replacing school staff, or state takeover of the 
school administration. The new law also provides that LEAs 
offer options to children in failing schools. Depending on the 
number of years a school has been designated for 
improvement, these options may include going to another 
public school with transportation paid by the LEA or using Title 
I funds to pay for supplemental help. 


Source: Pub. L. No. 103-382 (1994) and Pub. L. No. 107-110 (2001). 
State                 Number of tests needed 

Alabama               17 
Alaska                9 
Arizona               9 
Arkansas              9 
California            5 
Colorado              4 
Connecticut           8 
Delaware              0 
District of Columbia  17 
Florida               0 
Georgia               0 
Hawaii                9 
Idaho                 17 
Illinois              6 
Indiana               9 
Iowa                  0 
Kansas                11 
Kentucky              8 
Louisiana             8 
Maine                 8 
Maryland              15 
Massachusetts         6 
Michigan              8 
Minnesota             11 
Mississippi           3 
Missouri              8 
Montana               17 
Nebraska              11 
Nevada                11 
New Hampshire         9 
New Jersey            10 
New Mexico            17 
New York              8 
North Carolina        3 
North Dakota          11 
Ohio                  8 
Oklahoma              8 
Oregon                6 
Pennsylvania          11 
Puerto Rico           10 
Rhode Island          11 
South Carolina        3 
South Dakota          17 
Tennessee             15 
Texas                 1 
Utah                  0 
Vermont               9 
Virginia              6 
Washington            8 
West Virginia         17 
Wisconsin             17 
Wyoming               11 


Source: GAO survey. 
Table 11 provides estimates of assessment expenditures states may incur 
for grades and subjects they reported they would need to add to meet the 
additional assessment requirements under NCLBA. These estimates do not 
include any expenditures for continuing development or administration of 
assessments in grades and subjects already included in states’ reported 
assessment programs, unless states indicated plans to replace their existing 
assessments. Estimates reflect total expenditures between fiscal year 
2002 and 2008, and are based on the assumptions we made regarding 
question types. 



Table 11: Estimates of Expenditures for the Assessments Required by NCLBA That 
Were Not in Place at the Time of Our Survey, Fiscal Years 2002-08 

Dollars in billions 

Question type                   Estimate  Questions and scoring methods used 

Multiple-choice                 $0.8      Estimate assumes that all states use machine-scored multiple-choice questions. 
Current question type           $1.6      Estimate assumes that states use the mix of question types they reported in our survey. 
Multiple-choice and open-ended  $2.0      Estimate assumes that all states use both machine-scored multiple-choice questions and some hand-scored open-ended questions. 



Source: GAO. 



Note: Projections based on state assessment plans and characteristics and expenditure data 
gathered from 7 states. 
Table 12 provides test development and nondevelopment expenditures by 
state for fiscal years 2002 through 2008. Test development estimates reflect 
expenditures associated with both new and existing tests. 
Nondevelopment expenditures reflect expenditures associated with 
administration, scoring, and reporting of results for both new and existing 
assessments. 



Table 12: Estimates by State, Development, and Nondevelopment Expenditures 

Dollars in millions 

                      Multiple-choice and open-ended  Current question type           Multiple-choice 
State                 Development     Non-development Development     Non-development Development     Non-development 

Alabama               $16             $57             $15             $15             $15             $15 
Alaska                15              14              15              10              13              4 
Arizona               15              93              15              93              14              25 
Arkansas              14              39              14              28              13              10 
California            13              619             12              223             12              166 
Colorado              13              74              13              74              12              20 
Connecticut           14              54              14              54              13              15 
Delaware              12              13              12              13              11              3 
District of Columbia  13              4               12              1               12              1 
Florida               12              269             12              200             11              72 
Georgia               12              162             11              44              11              44 
Hawaii                14              17              14              17              13              5 
Idaho                 15              16              14              9               14              4 
Illinois              13              198             13              151             12              53 
Indiana               14              99              14              99              13              27 
Iowa                  12              50              12              50              11              14 
Kansas                14              37              13              23              13              10 
Kentucky              14              58              14              48              13              16 
Louisiana             14              67              14              67              13              18 
Maine                 14              19              14              19              13              5 
Maryland              16              75              16              75              15              20 
Massachusetts         13              96              13              96              12              26 
Michigan              14              163             14              163             13              44 
Minnesota             15              76              15              76              14              20 
Mississippi           13              51              13              51              12              14 
Missouri              14              85              14              85              13              23 
Montana               16              13              16              12              15              3 
Nebraska              13              21              13              21              12              6 
Nevada                14              31              13              13              13              8 
New Hampshire         13              18              13              18              12              5 
New Jersey            14              113             14              113             13              30 
New Mexico            16              25              16              24              15              7 
New York              14              262             14              262             13              70 
North Carolina        13              139             12              37              12              37 
North Dakota          14              9               14              9               13              2 
Ohio                  13              158             13              158             12              42 
Oklahoma              14              53              13              24              13              14 
Oregon                14              57              13              15              13              15 
Pennsylvania          15              166             15              147             14              45 
Puerto Rico           14              56              13              15              13              15 
Rhode Island          14              13              14              13              13              4 
South Carolina        13              73              13              70              12              19 
South Dakota          17              10              15              3               15              3 
Tennessee             15              70              14              19              14              19 
Texas                 12              429             11              221             11              115 
Utah                  12              50              12              33              11              13 
Vermont               15              10              15              10              14              3 
Virginia              13              116             12              48              12              31 
Washington            14              104             14              104             13              28 
West Virginia         17              26              16              7               16              7 
Wisconsin             15              57              15              51              14              15 
Wyoming               14              7               14              7               13              2 

Total                 $724            $4,590          $706            $3,237          $668            $1,233 



Source: GAO estimates based on state assessment plans and characteristics and expenditure data gathered from 7 states. 
Table 13 provides estimates for each question type and the benchmark 
appropriations by fiscal year from 2002 through 2008. Each estimate 
reflects assumptions about the type of questions on the assessments. For 
example, the multiple-choice estimate assumes that all states will use 
assessments with only multiple-choice questions. These estimates also 
assume that states implement the assessment plans reported to us. The 
benchmark appropriation is based on actual appropriations in 2002 and 
2003 and on the benchmark funding level in NCLBA for 2004-07. We 
assumed a benchmark of $400 million in 2008, the same as in 2005, 2006, 
and 2007. 



Table 13: Estimated Expenditures for Each Question Type, Fiscal Years 2002-08 

Fiscal year (in millions) 

Question type                   2002    2003    2004    2005    2006    2007    2008    Total 

Multiple-choice                 $165    237     288     291     293     308     318     $1,901 
Current question type           $324    442     572     615     633     665     692     $3,944 
Multiple-choice and open-ended  $445    586     761     824     855     903     941     $5,313 
Benchmark appropriation         $366    376     390     400     400     400     400     $2,733 



Source: GAO estimates based on state assessment plans and characteristics and expenditure data gathered from 7 states. 



Note: Fiscal years 2002 through 2008 sums may not equal the total because of rounding. 
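
As a reading aid, the yearly figures in Table 13 can be compared with the benchmark appropriation as sketched below. The values are the rounded figures from the table. The variable and function names are assumptions introduced for the example, and the comparison is an illustration, not part of GAO's analysis. Because the inputs are rounded, the sums differ from the published totals by a few million dollars.

```python
# Rounded values from Table 13, in millions of dollars, fiscal years 2002-08.
YEARS = list(range(2002, 2009))
ESTIMATES = {
    "multiple_choice": [165, 237, 288, 291, 293, 308, 318],
    "current_question_type": [324, 442, 572, 615, 633, 665, 692],
    "multiple_choice_and_open_ended": [445, 586, 761, 824, 855, 903, 941],
}
BENCHMARK = [366, 376, 390, 400, 400, 400, 400]


def gap_by_year(estimate):
    """Estimate minus benchmark for each year; positive means above benchmark."""
    return [e - b for e, b in zip(estimate, BENCHMARK)]


totals = {name: sum(values) for name, values in ESTIMATES.items()}
benchmark_total = sum(BENCHMARK)

for name, values in ESTIMATES.items():
    gaps = dict(zip(YEARS, gap_by_year(values)))
    print(name, totals[name] - benchmark_total, gaps)
```

On these figures, only the multiple-choice estimate stays below the benchmark in every year. The other two estimates exceed it from 2003 onward.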
UNITED STATES DEPARTMENT OF EDUCATION 

THE UNDER SECRETARY 

April 29, 2003 



Ms. Marnie S. Shaul 
Director 

Education, Workforce, and Income Security Issues 
United States General Accounting Office 
Washington, DC 20548 

Dear Ms. Shaul: 

I am writing in response to the General Accounting Office’s (GAO) draft report, “Title I: 
Characteristics of Tests Will Influence Expenses; Guidance May Help States Realize 
Efficiencies.” We appreciate the opportunity to review and respond. 

Problems with Estimating the Costs of Testing: 

While the draft report contains some useful information on the estimated costs of testing in the 
seven States studied, the report goes on to project these estimates on to all other States, which 
makes the report much less valuable and possibly misleading. We are very concerned about the 
inclusion and the weight given to the estimates of costs for each State based on estimates of the 
costs in the particular circumstances of only seven States studied in depth by GAO. In effect, 
this section of the draft report uses multiple levels of assumptions, which results in estimates that 
have the potential to be substantially in error. The GAO report ends up with three specific cost 
estimates for each State that have a ring of authority that we believe is significantly out of 
proportion to the confidence one can place in them. 

While the other forty-five “States” (including the District of Columbia and Puerto Rico) 
apparently responded to survey questions, it does not appear that they provided the level of 
detailed cost information used in the draft report on the costs for each of the States. As the study 
acknowledges, the factors in computing and estimating costs are very specific to the 
circumstances of each State, and cannot be generalized. 

Many factors can affect the costs in different States to make the estimates wrong and misleading. 
For example, the report cites the types of questions included in State assessments as one of the 
main reasons for different costs; yet 48 percent of the States reported that they are uncertain 
about the type of questions they will include on future tests, thus making projected costs in those 
States suspect. The report also cites other factors such as the number of different forms of 
assessments used, and the extent of public release of questions. We believe that there are many 
other factors that may also be crucial, such as the scoring of assessments through outside 
contracts versus the scoring of assessments by in-house staff, the expertise and experience of 
State staff, and many other individual characteristics of a State, including specific characteristics 
of its student population. 



400 MARYLAND AVE., S.W., WASHINGTON, D.C. 20202 

www.ed.gov 

Our mission is to ensure equal access to education and to promote educational excellence throughout the Nation. 
The number of new tests that each State reported it would need to develop is found in Table 3 
and used subsequently to estimate development costs by multiplying the number of tests in initial 
development by 3 times the average ongoing development expenditure. The assumption behind 
this calculation is questionable because: a) the GAO reports that they were typically not able to 
separate the costs of new test development from ongoing test development, and b) the costs of 
initial test development will vary tremendously by the nature of the test. For instance, to develop 
an 8th-grade reading test when grades 3-7 already are being tested should be a trivial expense 
compared to developing a science assessment when none currently exists. A more reasonable 
estimate for test development could be derived from the total test development expenditure in the 
seven States surveyed, since that includes initial and ongoing test development. 

We also question the draft report’s analysis of question type -- i.e., multiple-choice versus open- 
ended questions — as a key determinant of costs. The draft report fails to differentiate open- 
ended questions that involve short factual answers from open-ended questions that involve 
lengthy writing samples. The costs of the latter will be quite high compared to the costs of the 
former. The draft report also does not consider the proportion of open-ended questions 
employed in an assessment. The functions of open-ended questions can be provided by 
relatively small proportions of such questions compared to multiple-choice questions. By not 
taking into account the nature of open-ended questions, and by not adjusting for the ability of 
States to retain open-ended questions while lowering costs through reducing the proportion of 
such questions, the draft report likely over-estimates substantially the assessment costs in the 
upper two of its three estimates. 

The draft report assumes that costs will be lower in the first few years of test administration and 
will increase in later years when more tests are being developed and administered. One could 
reasonably argue the opposite, that costs are always greater at the outset and that States are likely 
to combine their test development process in a single content area such as reading across grades 
3-8 in the initial years. As a result “out year” costs would be lower. Moreover, GAO projections 
do not take into account the results of increasing competition as more companies enter the 
burgeoning State assessment market. Likewise, no provision is made for advances in 
technology. There are already companies in the market that are capable of administering State 
assessments with handheld computers. Software currently available can score open-ended 
questions. The forces of competition and technology almost surely will drive down costs in the 
development and administration of State assessments. 

Even though the draft report, in a footnote on page 26, acknowledges that the estimates “may be 
biased,” based on the many problems noted in this response, we strongly recommend the deletion 
of the information on estimates of the costs for the States not studied directly. 

We believe the report is also misleading in suggesting that all of the estimated costs are 
generated by the testing requirements in the No Child Left Behind Act. Educational assessments 
are an inherent responsibility of the States, and many States (and, in many cases, school districts) 
have already developed and administered tests that would meet No Child Left Behind 
requirements. Many of these costs have been borne or would be borne by the States irrespective 
of the No Child Left Behind Act. We think the report needs to make the point that not all of 
these costs are incremental costs generated by the Act. 



Appendix VII: Comments from the 
Department of Education 



Problems in Indicating the Sources of Federal Funding: 

In addition, the draft report contains information on sources of Federal funding, but does not, by 
any means, provide a complete picture. For example, it appears to focus on the funding under 
one Federal program under which testing costs are allowable. But it does not include other 
funding sources in the No Child Left Behind Act under which testing costs would also be 
allowable, such as Title I, Part A administrative costs, consolidated administrative costs under 
Section 9201 of the Elementary and Secondary Education Act (ESEA), Title V of ESEA, and the 
additional funds that may be transferred under Title VI of ESEA to various funding sources. 

Thus, it seriously understates available resources provided under Federal programs. 

The Recommendation for Sharing Information: 

We have no problems with the one recommendation contained in the report - that the 
Department of Education use its existing mechanisms to facilitate the sharing of information 
among States regarding assessment development and administration as States attempt to reduce 
expenses. As one example of information sharing already undertaken and facilitated by the 
Department, we suggest that you include in your report information on the Enhanced Assessment 
Grants, as authorized under Title VI of the No Child Left Behind Act. In February 2003, the 
Department awarded $17 million to fund projects aimed at improving the quality of State 
assessment instruments, especially for students with disabilities and students of limited English 
proficiency. In selecting grant recipients, the Department awarded priority points to applications 
submitted by consortia of States. All nine awards went to State consortia, ranging from three to 
fifteen States per consortium. The Department takes very seriously its commitment to provide 
technical assistance to State and local grantees and looks forward to continuing and enhancing 
its efforts to share information on assessment development and administration to help States 
reduce costs and make the accountability provisions of the No Child Left Behind Act as effective 
and efficient as possible. 

We additionally suggest a change in the title of the report to more accurately reflect the report’s 
content. We suggest substituting “Information Sharing” for “Guidance” so that the title reads 
“Characteristics of Tests Will Influence Expenses; Information Sharing May Help States Realize 
Efficiencies.” 

We appreciate the opportunity to provide comments on this draft report, and would be glad to 
work with your office to make the report more reliable and useful. 



Sincerely, 




Eugene W. Hickok 
GAO’s Mission 


The General Accounting Office, the audit, evaluation and investigative arm of 
Congress, exists to support Congress in meeting its constitutional responsibilities 
and to help improve the performance and accountability of the federal 
government for the American people. GAO examines the use of public funds; 
evaluates federal programs and policies; and provides analyses, 
recommendations, and other assistance to help Congress make informed 
oversight, policy, and funding decisions. GAO’s commitment to good government 
is reflected in its core values of accountability, integrity, and reliability. 